The slides (as HTML), data, and code for this talk are all available on GitHub:
https://github.com/geanders/GuestLectures/tree/master/soars_tutorial
3/8/2017
The htmlWidgets family of packages wraps JavaScript visualization libraries. In practice, this means you can create interactive visualizations from R, with the interaction handled by code that runs in the viewer's web browser.
These packages can be used to create interactive visualizations that can be added to several outputs:
bookdown and blogdown, respectively)
For example data, I've pulled listings from NOAA's Storm Events Database: all listings of flood and tornado events in Colorado between 2013 and 2015. (Note: I used the noaastormevents package, with development led by Ziyu Chen, to do this directly from R.)
I've created two data sets, co_floods and co_tornadoes. Each row represents an event. Columns include the event's date, location, damage, and a few other details.
load("data/co_floods.Rdata")
colnames(co_floods)
##  [1] "begin_date"        "end_date"          "event_type"
##  [4] "fips"              "cz_name"           "deaths_direct"
##  [7] "injuries_direct"   "damage_property"   "damage_crops"
## [10] "source"            "begin_lat"         "begin_lon"
## [13] "end_lat"           "end_lon"           "flood_cause"
## [16] "episode_narrative" "event_narrative"
co_floods %>% select(begin_date, event_type, cz_name, fips, begin_lat, begin_lon) %>% slice(1:3)
## # A tibble: 3 × 6
##   begin_date  event_type cz_name  fips begin_lat begin_lon
##       <date>       <chr>   <chr> <dbl>     <dbl>     <dbl>
## 1 2013-08-10 Flash Flood Fremont  8043   38.4858 -105.3768
## 2 2013-09-13       Flood  Pueblo  8101   38.4439 -104.5983
## 3 2013-09-13       Flood  Pueblo  8101   38.2708 -104.4557
One of the variables gives longer text descriptions of the event (and can be missing):
co_floods %>% select(begin_date, event_narrative) %>% sample_n(4) %>% pander(split.cells = c(5, 50))
| begin_date | event_narrative |
|---|---|
| 2013-09-11 | Flash flooding was observed in the High Park burn area. It forced the closure of CO14, between Rustic and Teds Place. Other road closures included Larimer County Road 70 between County Roads 20 and 21. |
| 2013-09-14 | The combination of heavy rain, coupled with extremely saturated ground conditions, produced additional flash flooding. |
| 2013-09-12 | Six to eight inches of water was reported flowing over Highway 385 at County Road C. A semi truck almost lost control driving through the flooded section of road. |
| 2014-07-19 | Heavy rain on the Waldo Canyon burn scar caused flooding on US Highway 24, along with mud flows. The flooding extended down to Waldo Canyon and into Manitou Springs. |
It's pretty straightforward to use R functions to create a static map with this data:
library(ggplot2)
map_data("county", region = "colorado") %>%
ggplot(aes(x = long, y = lat, group = group)) +
geom_polygon(color = "darkgray", fill = "white") +
geom_point(data = co_floods, aes(x = begin_lon, y = begin_lat, group = NULL),
color = "blue", alpha = 0.5) +
coord_map() + theme_void()
However, we might prefer sometimes to create something interactive:
The three basic components of a leaflet map are:
co_floods %>% leaflet() %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(lng = ~ begin_lon, lat = ~begin_lat, radius = 3, color = "blue")
With addProviderTiles, you can pick from many different background map tiles. Visit this provider preview page to see options and get the provider names to use in the R call. Here is the same map using different map tiles:
co_floods %>% leaflet() %>%
addProviderTiles("NASAGIBS.ViirsEarthAtNight2012") %>%
addProviderTiles("CartoDB.DarkMatterOnlyLabels") %>%
addCircleMarkers(lng = ~ begin_lon, lat = ~ begin_lat, radius = 3, color = "green")
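If you would rather look up provider names from within R, the leaflet package also bundles them in a named list called providers. The following is a minimal sketch, assuming the leaflet package is installed; the exact set of names depends on the installed version, and the setView coordinates are just a rough center for Colorado.

```r
library(leaflet)

# `providers` is a named list of tile provider names bundled with leaflet
provider_names <- names(providers)
head(provider_names)

# Find all tile sets from one family, e.g. CartoDB
cartodb_tiles <- grep("^CartoDB", provider_names, value = TRUE)
cartodb_tiles

# Use one of the listed names in a map
tile_map <- leaflet() %>%
  addProviderTiles(providers$OpenStreetMap.Mapnik) %>%
  setView(lng = -105.5, lat = 39, zoom = 6)
tile_map
```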
There are various things you can add to the map to map data to locations. Several are listed here, and all can be added using add plus the item name (for example, addMarkers).
- Markers: "Push pin" style markers
- CircleMarkers: The circle markers shown in examples so far
- LabelOnlyMarkers
- Polylines
- Circles
- Rectangles
- Polygons
You can layer several of these on the same leaflet map (e.g., roads with Polylines, counties with Polygons, exact locations with one of the markers).
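As a rough sketch of layering, the following combines a rectangle, a line, and circle markers on one map. All coordinates and labels here are made up for illustration (points near Fort Collins, CO) and are not taken from the storm events data.

```r
library(leaflet)

# Hypothetical coordinates near Fort Collins, CO, used only for illustration
road <- data.frame(lng = c(-105.20, -105.08, -104.95),
                   lat = c(40.55, 40.59, 40.60))
sites <- data.frame(lng = c(-105.08, -105.00),
                    lat = c(40.59, 40.52),
                    label = c("Site A", "Site B"))

layered_map <- leaflet() %>%
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  # A rectangle defined by two opposite corners
  addRectangles(lng1 = -105.25, lat1 = 40.45, lng2 = -104.90, lat2 = 40.65,
                fillOpacity = 0.1) %>%
  # A line connecting the points in row order
  addPolylines(data = road, lng = ~ lng, lat = ~ lat, color = "red") %>%
  # Point markers drawn on top of the other layers
  addCircleMarkers(data = sites, lng = ~ lng, lat = ~ lat,
                   radius = 5, popup = ~ label)
layered_map
```

Layers are drawn in the order they are added, so the markers end up on top of the rectangle and the line.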
Data can be mapped from a dataframe with columns for longitude and latitude. In this case, the names of the correct columns for longitude and latitude should be specified with the lng and lat arguments:
leaflet() %>%
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  # Name both lng and lat explicitly rather than relying on argument position
  addCircleMarkers(data = co_floods, lng = ~ begin_lon, lat = ~ begin_lat,
                   radius = 3, color = "blue") %>%
  addMarkers(data = co_tornadoes, lng = ~ begin_lon, lat = ~ begin_lat)
Alternatively, if you have data saved as a spatial object, you can map that directly without specifying latitude and longitude columns.
The tigris package allows you to pull US Census TIGER shapefiles directly from R. The following call pulls a shapefile with Colorado county boundaries (the cb option is set to TRUE so that we pull a lower-resolution version):
library(tigris)
co_counties <- counties(state = 'CO', cb = TRUE)
class(co_counties)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
This spatial object can be used when mapping data to location with the data option:
leaflet() %>%
addProviderTiles("Stamen.TonerBackground") %>%
addPolygons(data = co_counties) %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat,
radius = 3, color = "green")
You can use a custom icon for the map markers. For example, this tornado icon was created by Gilad Fried and is under a Creative Commons license. You can use the following code to use it to mark the Colorado tornadoes (this assumes it's been saved locally as "figures/tornado.png"):
co_tornadoes %>%
leaflet() %>% addProviderTiles("OpenStreetMap.Mapnik") %>%
addMarkers(~ begin_lon, ~ begin_lat,
icon = makeIcon("figures/tornado.png", iconWidth = 20, iconHeight = 20))
In cases where a leaflet map has a lot of points, it can be hard to interpret until you zoom in. In this case, it often helps to use markerClusterOptions() to create cluster markers until the map is zoomed in.
leaflet() %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat,
radius = 3, color = "blue",
clusterOptions = markerClusterOptions())
For any of these data mappings, you can add "pop-ups" to show information when a person clicks on a marker or shape. To do this, specify either a column from the dataframe you're mapping or a vector of the same length for the popup option. For example, the following call uses the beginning date of each flood in the pop-ups:
leaflet() %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat, popup = ~ begin_date,
radius = 3, color = "green")
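The example above uses a column of the dataframe for the pop-ups. As a small sketch of the other option, the following passes a separate character vector with one element per point. The locations and the pt_names vector are hypothetical and used only for illustration.

```r
library(leaflet)

# Hypothetical points, for illustration only
pts <- data.frame(lng = c(-105.08, -104.99, -104.82),
                  lat = c(40.59, 39.74, 38.83))

# A separate character vector with one element per row of `pts`
pt_names <- c("Fort Collins", "Denver", "Colorado Springs")
stopifnot(length(pt_names) == nrow(pts))

vector_popup_map <- leaflet(pts) %>%
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  # No formula (~) here, because pt_names is not a column of `pts`
  addCircleMarkers(lng = ~ lng, lat = ~ lat, popup = pt_names, radius = 3)
vector_popup_map
```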
Shape files will often include some information in the data slot that might be useful in a pop-up. For example, the data for co_counties includes county name:
head(co_counties@data)
##     STATEFP COUNTYFP COUNTYNS       AFFGEOID GEOID        NAME LSAD
## 27       08      013 00198122 0500000US08013 08013     Boulder   06
## 28       08      029 00198130 0500000US08029 08029       Delta   06
## 29       08      059 00198145 0500000US08059 08059   Jefferson   06
## 30       08      091 00198161 0500000US08091 08091       Ouray   06
## 339      08      019 00198125 0500000US08019 08019 Clear Creek   06
## 340      08      023 00198127 0500000US08023 08023    Costilla   06
##          ALAND   AWATER
## 27  1881212055 36592000
## 28  2958007403 16886462
## 29  1979311263 25444831
## 30  1402657135  1599543
## 339 1023554877  3279667
## 340 3177806137  8828906
You can reference values in that data slot when mapping data to locations from data stored in a spatial object:
leaflet() %>%
addProviderTiles("Stamen.TonerBackground") %>%
addPolygons(data = co_counties, popup = co_counties@data$NAME)
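Depending on the version of tigris you have installed, counties() may return an sf object rather than an sp object, in which case there is no @data slot to reference. As a sketch under that assumption, columns of an sf object can be referenced with the formula interface instead. The polygon below is a hypothetical rectangle built by hand, not a real county boundary, and the sf package is assumed to be installed.

```r
library(leaflet)
library(sf)

# A hypothetical rectangular "county" (closed ring: first point == last point)
ring <- rbind(c(-105.5, 40.0), c(-105.0, 40.0), c(-105.0, 40.5),
              c(-105.5, 40.5), c(-105.5, 40.0))
toy_county <- st_sf(NAME = "Example County",
                    geometry = st_sfc(st_polygon(list(ring)), crs = 4326))

sf_map <- leaflet() %>%
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  # With an sf object, attribute columns can be used through a formula
  addPolygons(data = toy_county, popup = ~ NAME)
sf_map
```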
Often, it can be helpful to paste together information from several columns of the dataframe to include in the pop-up:
co_floods %>%
leaflet() %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat,
popup = ~ paste("Date:", begin_date, "to", end_date),
radius = 3, color = "green")
If you want to get even fancier, you can include HTML tags to style the text in the pop-ups:
co_floods <- co_floods %>%
mutate(popup_text = paste0("<div class='leaflet-popup-scrolled' style='max-height:150px'>",
"<b>County: </b>", cz_name, "<br/>",
"<b>Dates: </b>", begin_date, " to ", end_date, "<br/>",
"<b># deaths: </b>", deaths_direct, "<br/>",
"<b># injuries: </b>", injuries_direct, "<br/>",
"<b>Property damage: </b>$", damage_property, "<br/>",
"<b>Crop damage: </b>$", damage_crops, "<br/>",
event_narrative))
Here is the map using these fancier pop-ups:
co_floods %>%
leaflet() %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat, popup = ~ popup_text,
radius = 3, color = "green")
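Because event_narrative can be missing, pasting it directly into the pop-up text can leave a literal "NA" at the end of the pop-up. One way to guard against that is sketched below in base R. The make_popup helper and the toy data are hypothetical and only stand in for a few rows of co_floods.

```r
# Hypothetical helper (not part of leaflet): build pop-up HTML,
# replacing a missing narrative with an empty string
make_popup <- function(county, begin_date, narrative) {
  narrative <- ifelse(is.na(narrative), "", narrative)
  paste0("<b>County: </b>", county, "<br/>",
         "<b>Date: </b>", begin_date, "<br/>",
         narrative)
}

# Toy data standing in for a few rows of co_floods
toy <- data.frame(cz_name = c("Larimer", "Pueblo"),
                  begin_date = as.Date(c("2013-09-11", "2013-09-13")),
                  event_narrative = c("Flash flooding was observed.", NA),
                  stringsAsFactors = FALSE)
toy$popup_text <- make_popup(toy$cz_name, toy$begin_date, toy$event_narrative)
toy$popup_text
```

The same replacement could be done inside the mutate call with ifelse before pasting the columns together.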
By default, the map will initially zoom to a view that bounds everything mapped. If you want to customize where and how much the map initially zooms, you can do that with setView. For example, this call sets the initial map to show the Fort Collins area rather than all of Colorado:
co_floods %>%
leaflet() %>% setView(lng = -105.0844, lat = 40.5853, zoom = 9) %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat, popup = ~ popup_text,
radius = 3, color = "blue")
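If you would rather set the initial view from a bounding box than from a center point and zoom level, leaflet also provides fitBounds. The following is a minimal sketch with hypothetical points; the coordinates are made up for illustration.

```r
library(leaflet)

# Hypothetical event locations, for illustration only
pts <- data.frame(lng = c(-105.10, -105.02, -104.90),
                  lat = c(40.52, 40.60, 40.45))

bounded_map <- leaflet(pts) %>%
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  addCircleMarkers(lng = ~ lng, lat = ~ lat, radius = 3) %>%
  # Zoom so the box spanned by two opposite corners fills the view
  fitBounds(lng1 = min(pts$lng), lat1 = min(pts$lat),
            lng2 = max(pts$lng), lat2 = max(pts$lat))
bounded_map
```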
If you set a tighter zoom like this, but also have data for a wider area, you may want to include a mini-map to help users navigate the map. You can do this with addMiniMap:
co_floods %>%
leaflet() %>% setView(lng = -105.0844, lat = 40.5853, zoom = 9) %>%
addMiniMap(position = "topright") %>%
addProviderTiles("OpenStreetMap.Mapnik") %>%
addCircleMarkers(data = co_floods, lng = ~ begin_lon, ~ begin_lat, popup = ~ popup_text,
radius = 3, color = "blue")
plotly library
The plotly library is another library in the htmlWidgets family. It is more general-purpose, with functions for creating lots of different types of interactive plots.
One particular appeal is that it can be used to wrap ggplot objects, to create interactive visualizations very efficiently from code you already have to create static plots.
For example, you can create a static time series of number of flood events by date in Colorado using this code:
library(lubridate) # for ymd() and year()
flood_ts <- co_floods %>% count(begin_date) %>%
  # Join with every date in the period so days without floods are included
  full_join(tibble(begin_date = seq(ymd("2013-01-01"),
                                    ymd("2015-12-31"), by = 1)),
            by = "begin_date") %>%
  mutate(n = ifelse(is.na(n), 0, n)) %>% arrange(begin_date) %>%
  ggplot(aes(x = begin_date, y = n)) +
  geom_line() + theme_classic() +
  labs(x = "Date", y = "# of flood events\nin Colorado") +
  facet_wrap(~ year(begin_date), ncol = 1, scales = "free_x")
flood_ts
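The key data step in the code above is filling in a count of zero for every date with no events, so the line drops to zero instead of connecting straight across gaps. As an illustration of just that step, here is a base R sketch with a few toy dates standing in for co_floods$begin_date:

```r
# Toy event dates standing in for co_floods$begin_date
event_dates <- as.Date(c("2013-09-11", "2013-09-11", "2013-09-13"))

# Count events per observed date
counts <- as.data.frame(table(begin_date = event_dates),
                        stringsAsFactors = FALSE)
counts$begin_date <- as.Date(counts$begin_date)
names(counts)[names(counts) == "Freq"] <- "n"

# Every date in the range of interest, including dates with no events
all_dates <- data.frame(begin_date = seq(as.Date("2013-09-10"),
                                         as.Date("2013-09-14"), by = "day"))

# Keep all dates, then replace missing counts with zero
daily <- merge(all_dates, counts, by = "begin_date", all.x = TRUE)
daily$n[is.na(daily$n)] <- 0
daily
```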
Then you can use ggplotly to transform that ggplot object to an interactive graphic:
library(plotly)
ggplotly(flood_ts)
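If you want to try ggplotly without the storm events data, the same pattern can be sketched with a data set that ships with R. This example uses mtcars and assumes the ggplot2 and plotly packages are installed; the plot itself is only an illustration.

```r
library(ggplot2)
library(plotly)

# Static scatterplot from a data set that ships with R
static_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", color = "Cylinders")

# Wrap the ggplot object to get tooltips, zooming, and panning in the browser
interactive_plot <- ggplotly(static_plot)
interactive_plot
```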
You can also create plotly graphics "from scratch". This approach builds up a plot step by step with the pipe (%>%), much as ggplot2 builds up a plot by adding layers with +.
As an example, we might want to figure out:
The choroplethr package includes a data set with US county populations (df_pop_county). (You could also pull this through acs or something similar, but you'd need an API key.)
data(df_pop_county, package = "choroplethr")
head(df_pop_county)
##   region  value
## 1   1001  54590
## 2   1003 183226
## 3   1005  27469
## 4   1007  22769
## 5   1009  57466
## 6   1011  10779
Property damages are in a weird format:
head(co_tornadoes$damage_property, 20)
##  [1] "0.00K" "0.00K" "5.00K" "0.00K" "0.00K" "0.00K" "0.00K" "0.00K"
##  [9] "0.00K" "0.00K" "0.00K" "0.00K" "0.00K" "0.00K" "0.00K" "0.00K"
## [17] "0.00K" "0.00K" "0.00K" "0.00K"
But there's a function in noaastormevents we can use to parse those to numeric values:
head(noaastormevents::parse_damage(co_tornadoes$damage_property), 20)
##  [1]    0    0 5000    0    0    0    0    0    0    0    0    0    0    0
## [15]    0    0    0    0    0    0
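To give a sense of what this kind of parsing involves, here is a rough base R sketch. It is not the noaastormevents implementation: parse_damage_sketch is a hypothetical function, and it assumes that the suffixes K, M, and B stand for thousands, millions, and billions of dollars and that a value with no suffix is already in dollars.

```r
# Hypothetical parser, for illustration only (not the noaastormevents code)
parse_damage_sketch <- function(x) {
  # Assumed meaning of each suffix
  multipliers <- c(K = 1e3, M = 1e6, B = 1e9)
  # Split each string into its (optional) suffix and its numeric part
  suffix <- toupper(sub("^[0-9.]*", "", x))
  value <- as.numeric(sub("[A-Za-z]*$", "", x))
  # No suffix means the value is taken to be in dollars already
  mult <- ifelse(suffix == "", 1, multipliers[suffix])
  unname(value * mult)
}

# Gives 0, 5000, 1500000, and 250
parse_damage_sketch(c("0.00K", "5.00K", "1.50M", "250"))
```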
You can count the number of tornadoes in each county and join with this population data:
county_tornadoes <- co_tornadoes %>%
mutate(damage_property = noaastormevents::parse_damage(damage_property)) %>%
group_by(cz_name) %>%
summarize(fips = first(fips),
tornado_count = n(),
damage_property = sum(damage_property)) %>%
right_join(df_pop_county %>% mutate(is_co = str_detect(region, "^8")) %>%
filter(is_co) %>% rename(population = value) %>% select(-is_co),
by = c("fips" = "region")) %>%
mutate(tornado_count = ifelse(is.na(tornado_count), 0, tornado_count),
damage_property = ifelse(is.na(damage_property), 1, damage_property + 1))
head(county_tornadoes, 5)
## # A tibble: 5 × 5
##    cz_name  fips tornado_count damage_property population
##      <chr> <dbl>         <dbl>           <dbl>      <dbl>
## 1    Adams  8001            12           15001     442996
## 2  Alamosa  8003             2               1      15750
## 3 Arapahoe  8005             4               1     574357
## 4     <NA>  8007             0               1      12109
## 5     Baca  8009             6            3001       3783
You can use piping to create a plotly object, map attributes of the plot to elements of the data, and then add and change elements of the object (add markers, adjust axes, etc.).
co_plot <- county_tornadoes %>%
plot_ly(x = ~ population, y = ~ tornado_count) %>%
add_markers(color = ~ log10(damage_property),
alpha = 0.6,
text = ~ paste0(cz_name, " (FIPS: ", fips, ")"),
hoverinfo = "x+y+text") %>%
colorbar(title = "Log of property damage") %>%
layout(title = "Colorado tornadoes by county",
xaxis = list(title = "Population", showgrid = FALSE, type = "log"),
yaxis = list(title = "# of tornadoes (2013-2015)", showgrid = FALSE))
co_plot
You can also create 3-D scatter plots with plotly:
county_tornadoes %>%
plot_ly(x = ~ log10(population), y = ~ tornado_count,
z = ~ log10(damage_property)) %>%
add_markers(size = I(4), text = ~ paste0(cz_name, " (FIPS: ", fips, ")"),
hoverinfo = "x+y+text")
Shiny allows you to power web applications with R code run on a server. The interactive graphics created with htmlWidgets, on the other hand, are interactive through JavaScript code run in the viewer's web browser.
Image source: http://mi-linux.wlv.ac.uk/
This means that htmlWidgets graphs can be viewed without creating something linked to a Shiny server. (It also can mean that the data behind the graphic is passed to the viewers, so be careful if using sensitive data.)
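One practical consequence is that a widget can be written out as an HTML file and opened or shared without any server. The following is a minimal sketch using saveWidget from the htmlwidgets package; the file name and the use of a temporary directory are arbitrary choices for illustration.

```r
library(leaflet)
library(htmlwidgets)

widget <- leaflet() %>%
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  setView(lng = -105.0844, lat = 40.5853, zoom = 9)

# Write the widget to an HTML file that can be opened in a browser.
# selfcontained = FALSE stores the JavaScript dependencies in a folder next
# to the file; selfcontained = TRUE bundles everything into one file but
# requires pandoc.
out_file <- file.path(tempdir(), "fort_collins_map.html")
saveWidget(widget, file = out_file, selfcontained = FALSE)
```

Anything mapped in the widget is written into the saved output, which is why the caution about sensitive data applies to saved files as well.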
Many of the packages in the htmlWidgets family were developed at RStudio. Both the overall documentation for htmlWidgets and documentation for specific packages in the family are typically exceptional.
These sources were used in developing these slides and are also excellent references for finding out more:
- leaflet package: http://rstudio.github.io/leaflet/
- plotly book: https://cpsievert.github.io/plotly_book/
- htmlWidgets: http://www.htmlwidgets.org/
- htmlWidgets Showcase: http://www.htmlwidgets.org/showcase_leaflet.html
- htmlWidgets gallery: http://gallery.htmlwidgets.org